Talk:Halloween (story)
Mosp, Isp, Osp

We'll need to add Isp and Osp to the list of tracked characters later. I'll probably see to it when I know how far ahead Greendemon jumps. Since the asps always appear together, even before being joined with Mosp, they should get a joint article, and from my understanding, "Isp & Osp" shouldn't confuse NiftyBot the way "Isp and Osp" would. Eques Concordia 21:51, 7 January 2009 (UTC)
: That is correct, NiftyBot shouldn't get confused about "Isp & Osp". However, I'm thinking I should just rectify that weakness in NiftyBot. Basically, it looks for commas or the word "and" in a noun clause to split the character list. (If you're familiar with regular expressions, it uses \s*(,|and)\s+ to search for them.) What NiftyBot fails to notice is whether or not those delimiters appear inside a character name.
: Now, for unlinked names, there's really no way for it to know, but linked names are a different story; if a delimiter appears inside double brackets, it ought to be smart enough to recognize that and ignore it as a delimiter.
: There are a few ways I could fix this. The preferred way would be to come up with a regular expression that would account for links, but I'm not sure how to write one that would be that smart. Another would be to check for character names that have open double brackets without closing brackets, and then check to see if the closing brackets are found at the end of a later character name. Yet another would be to dump the regular expression entirely and just write code to slice up the noun clause myself.
: I'd welcome assistance from anyone out there with regular expression chops. AlternateTorg 22:25, 7 January 2009 (UTC)
:: From a readability perspective, I think "Isp and Osp" would be preferable. However, I'm not well versed enough in regular expressions to help there. I understand that there is a way to make sure that a regular expression only matches a minimum of text, which would be essential for the first option, but that could still get something like [[Riff]] and [[Torg]], as far as I understand. Eques Concordia 23:08, 7 January 2009 (UTC)
:: One idea would be to split with the regular expression you're using, then within each name, check for "[[" without a matching "]]", maybe something like \[\^\* (not tested, the escapes may be tricky). If you find that case, the "and" you just split on was inside brackets, and you need to link up those elements. Is the code available somewhere? I'd like to take a look at it, just out of curiosity. Greendemon 01:21, 8 January 2009 (UTC)
::: That's essentially idea #2 that I posted above. The only problem with that approach is that I'd have to go back to the original sentence in order to see what's joining the nouns; it could be "and" or a comma.
::: I've been wanting to get the source code published, for a number of reasons: 1) if something happens to me, the code is available for others to use; 2) people can see what it's doing and, if they happen to have some Java skills, can suggest improvements; and 3) for transparency reasons, so people can see that it's not malicious. I'm figuring I can just ZIP it up and upload it to the wiki; the wiki has a 5 MB limit for files, but the ZIP is only about 27K (not including dependencies). Before I do that, though, I'd like to go through the code and make sure I've properly documented everything. I've been trying to be good about documenting as I go, but I'm sure there are some areas that can be improved. I'll try and get that done soon.
::: Speaking of which, anyone know what an appropriate license for the source would be? It looks like all NiftyBot's dependencies use an Apache-style license (mostly version 2.0). I wouldn't distribute the dependencies with the ZIP; I'd just provide links to go get them. I'd like to ensure that the license will allow anyone to take over should I for some reason not be able to run NiftyBot anymore. AlternateTorg 18:13, 8 January 2009 (UTC)
::: Update: NiftyBot's binary and source distributions are now available for anyone to download. Have a gander if you like!
::: Update 2: NiftyBot now tolerates commas and the word "and" in linked names. For the curious, I solved the problem by not splitting immediately on the delimiters, and instead searching for links as well, then stepping through the list of delimiters and determining whether or not the current delimiter is found inside a link. If so, it is ignored; otherwise, the text between the previous unlinked delimiter (or the beginning of the noun clause) and the current delimiter is considered to be a noun. I went ahead and added Isp and Osp to this story, but the other stories will need to be updated.
:::: Thanks for that. I'll try to go through the other stories as I get the time, then. Eques Concordia 21:49, 19 January 2009 (UTC)
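
For readers who want to see the approach from "Update 2" in concrete form, here is a minimal sketch in Python. It is not NiftyBot's actual Java code. The delimiter pattern is the one quoted earlier in the thread, while the link pattern `\[\[.*?\]\]` and the function name `split_nouns` are assumptions made for illustration. The non-greedy `.*?` stops at the first closing brackets, so two separate links joined by "and" are not merged into one.

```python
import re

# Delimiter pattern as quoted in the thread.
DELIMITER = re.compile(r"\s*(,|and)\s+")
# Assumed pattern for a wiki link; non-greedy so each link ends at its own "]]".
LINK = re.compile(r"\[\[.*?\]\]")


def split_nouns(clause):
    """Split a noun clause on commas and "and", ignoring delimiters inside [[links]]."""
    link_spans = [m.span() for m in LINK.finditer(clause)]
    nouns = []
    start = 0  # end of the last delimiter that was accepted as a real separator
    for delim in DELIMITER.finditer(clause):
        # A delimiter that overlaps any link span is part of a name, so skip it.
        if any(s < delim.end() and delim.start() < e for s, e in link_spans):
            continue
        noun = clause[start:delim.start()].strip()
        if noun:
            nouns.append(noun)
        start = delim.end()
    # Whatever follows the last accepted delimiter is the final noun.
    tail = clause[start:].strip()
    if tail:
        nouns.append(tail)
    return nouns


print(split_nouns("[[Riff]], [[Torg]] and [[Isp and Osp]]"))
# → ['[[Riff]]', '[[Torg]]', '[[Isp and Osp]]']
```

As noted in the thread, unlinked names offer no such protection: an unlinked "Isp and Osp" would still be split in two, which is why "Isp & Osp" was the safe spelling before the fix.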